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1. IuPYOo Tor eoy 

SOLAR AQVEPYTnE 

This section serves as a common preface to each of the user's
guides describing the SOLAR files. It outlines the goals of SOLAR and
the relationship of each file to those goals. It ends with a list of
the documents describing SOLAR.

SOLAR is intended to provide easy access to a large variety of
semantic data pertaining to a selected set of English words. Data have
been collected to date on about 1,000 SUR words, i.e., words found in
the lexicons of the Speech Understanding Research groups being sponsored
by ARPA.(1) Each of the principal SOLAR files contains semantic
information of a different type. Two other files support use of the
archive: a word index and a bibliography.(2)

(1) The file of semantic analyses consists of formal descriptions
of word meanings, primarily those descriptions given in papers written
by linguists, philosophers, and computer scientists. Whatever
information the author presents on such topics as predicate-argument
relations, semantic components, presuppositions, and/or entailment is
entered here. Informal comments by the author are included, as are
criticisms made by other authors.

(1) Although the words for which data are currently being collected
all come from the lexicons being used by the SUR projects at
Carnegie-Mellon University, Bolt Beranek and Newman, and System
Development Corporation, we are willing to accept and collect data on
other word sets also.

(2) I wish to acknowledge John Olney's contributions to the
jig: he vom Marwoly roesoogeiinhe Bot, Hay Orluie denis of SOT Nias 
mee forth jn Sailer Wi Yiesv 11970) and mhatinnbs|@o 4e > soonsihle for 


the compilation of integrative summaries of conceptual analyses.


(2) A second file provides a concise digest of the theoretical
background of each semantic analysis. The author's theoretical
orientation, his assumptions, and his notational conventions are
summarized.


(3) Explanatory notes for the semantic components used in the
semantic analyses are entered into a third file. These notes explain as
precisely as possible the conceptual content each author evidently
intends his component(s) to have. Included in the file are any comments
on the author's use of components that the SOLAR builders have deemed
appropriate.

(4) A file of conceptual analyses contains integrative summaries
of the best analyses found in the recent literature of analytic
philosophy and artificial intelligence for particular notions, primarily
those coinciding with or underlying the semantic components entered in
the preceding file.
(5) A collocational feature file contains, for SUR words, the
definitions from Webster's Seventh New Collegiate Dictionary (W7) in
which a subject label, a parenthetic phrase, a usage note, or a verbal
illustration appears. Each of these elements supplies some indication
of the sorts of word classes permissible in the immediate context of a
word.

(6) A semantic fields file(3) will provide a series of displays
showing most of the other words in the English vocabulary that stand in
a morphological, definitional, synonymitive, antonymitive, or thesaural
relationship to a given word. Such relationships will be machine
derived from the W7 transcripts, a partial transcript of Webster's New
Dictionary of Synonyms, and a thesaurus transcript (hopefully the
transcript of Roget's International Thesaurus being prepared by Sally
Sedelow at the University of Kansas).

(3) The structure of this file and procedures for creating it have
been specified in detail; however, coding has not yet begun on the
several programs needed.


(7) A file of definitional expansions(4) will indicate the extent
and nature of the semantic connectedness among words in a particular
lexicon. For each word in a given lexicon, a display will be provided
of all the words in that lexicon that can be reached by following W7
definitional links outward to two levels of remoteness from that word.
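
The two-level expansion just described can be pictured as a breadth-first
traversal of definitional links. The sketch below is only an illustration
in Python, not the SOLAR programs: the function name, the dictionary
representation of links, and the toy data are assumptions made for the
example, and it further assumes that links are followed only through words
that themselves belong to the lexicon.

```python
from collections import deque

def definitional_expansion(word, definitions, lexicon, max_depth=2):
    """Map each lexicon word reachable from `word` to its level of remoteness.

    `definitions` maps a word to the words used in its definition
    (a hypothetical stand-in for the W7 definitional links).
    """
    lexicon = set(lexicon)
    levels = {}
    seen = {word}
    queue = deque([(word, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the outermost level
        for linked in definitions.get(current, ()):
            if linked in lexicon and linked not in seen:
                seen.add(linked)
                levels[linked] = depth + 1
                queue.append((linked, depth + 1))
    return levels

# Toy data, invented for illustration; not taken from W7.
toy_definitions = {
    "ship": ["vessel", "large"],
    "vessel": ["craft", "container"],
    "craft": ["boat"],
}
toy_lexicon = ["ship", "vessel", "craft", "boat", "container"]
expansion = definitional_expansion("ship", toy_definitions, toy_lexicon)
print(expansion)
```

In this toy case "boat" is left out because it lies three links away from
"ship", and "large" is left out because it is not in the lexicon.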


(8) A keyword-in-context (KWIC) file will, when completed,
contain representative contexts of a given word's occurrences in the
million-word Brown Corpus, the 1.7 million-word corpus of W7
definitions, and dialogues collected by the speech understanding groups.
The first of the two remaining files is a word index, which lists
all the words appearing in the speech understanding lexicons, the
lexicons they appear in, the parts of speech given for each word in the
lexicon together with their corresponding parts of speech in W7, and the
types of SOLAR data available for each word.
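
To make the contents of a word index entry concrete, the following Python
sketch shows one way such a record might be laid out. It is a hypothetical
layout for illustration only: the class name, attribute names, and sample
values are assumptions, and only the lexicon names SUBS and TRAVEL are
taken from this guide.

```python
from dataclasses import dataclass, field

@dataclass
class WordIndexEntry:
    """One word-index record (hypothetical layout, for illustration only)."""
    word: str
    # lexicon name -> parts of speech given for the word in that lexicon
    lexicon_pos: dict = field(default_factory=dict)
    # parts of speech listed for the word in W7
    w7_pos: list = field(default_factory=list)
    # types of SOLAR data available for the word
    solar_data: list = field(default_factory=list)

    @property
    def lexicons(self):
        """Names of the lexicons in which the word appears."""
        return sorted(self.lexicon_pos)

def words_in_lexicon(index, lexicon):
    """Return the indexed words that appear in the named lexicon."""
    return sorted(e.word for e in index if lexicon in e.lexicon_pos)

# Sample values are invented for the example.
index = [
    WordIndexEntry("ship", {"SUBS": ["noun"], "TRAVEL": ["noun", "verb"]},
                   ["noun", "verb"], ["KWIC"]),
    WordIndexEntry("fare", {"TRAVEL": ["noun"]}, ["noun", "verb"], ["KWIC"]),
]
print(words_in_lexicon(index, "TRAVEL"))
```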

A bibliography file provides citations to the technical documents
in linguistics, philosophy, and computer science that are referenced in
other SOLAR files or may be of interest to researchers in natural
language processing.


(4) Although this file has not yet been produced, the programs needed


LGR egieaey Tiare 
In scholarly lexicography, the primary basis on which the meaning
of a word is analyzed into separate senses is a balanced collection of
contexts of actual occurrences of the word in many different types of
writings. Since the companion SOLAR files provide information on only
one, or at most, a few of the senses of a given word, the KWIC file adds
a measure of generality which broadens the utility of the archive. By
providing the user with representative samples of the constructions in
which particular words appear, we assure him a better chance of settling
upon a sense analysis for a particular lexical domain which is
compatible with an extension of the domain, when that occurs.

To assure maximum generality while providing data specifically
related to the ARPA SUR projects, we have included contexts extracted
from both written and spoken sources. The Brown Corpus provides a
balanced source of written contexts.(5) The W7 definitions provide a
unique contribution because the occurrences of words in its definitions
are contextually unspecialized to a far greater extent than those in the
Brown Corpus or in any other sample of normal writing.(6) Supplementing
these general sources of written contexts are the transcripts of the
protocols collected by the ARPA SUR projects.


(5) The Brown Corpus contains a million words of text, as extracted
from 500 sources representing 15 genres of writing. It covers the same


- pee termbteing ame che £1 bev ot~ 10 miflion contayé¢s usel in vrenaring 
Webster's Third Unabridged. Details on the construction and contents of the
Brown Corpus are to be found in Kucera and Francis (1967) and in the
Manual of Information (1964).

(6) More than 1,200,000 contexts can be extracted from the W7
definitions; however, over a third of these consist of contexts of
highly frequent function words.


Having such contexts of occurrence at his fingertips will benefit
the archive user by supplementing his linguistic intuition as to details
of the usage of a word, as in standard lexicographic practice.


The design of this file has been affected in several ways by our


pret ian 1 yb + he Mirect Uy ACCES] : to ae »retyers wa + AYR a 
net ware. : 7. tie: io : : : 3 ' la } tee 4174 . on 
Mahade@ent syarem to assure that +h fig Se eS ars} learning; tha 


file structure and data management protocols will be minimal. Second,
the length of centered word plus contexts was constrained to a single
terminal output line to facilitate reading. Third, the number of lines
of context per centered word was limited to 100 (for the Brown Corpus
and W7). No stop list was used, however, so contexts are included for
all SUR words (including function words) which have a match in the Brown
Corpus or the W7 definitions, as the case may be.
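
The constraints above (one terminal line per context, a cap on the number
of lines per centered word, and no stop list) can be illustrated with a
short Python sketch. This is not the program used to build the file; the
function name, the tokenization by whitespace, and the sample sentence are
assumptions, and the 61-character width is the figure quoted later in this
guide for the Brown contexts.

```python
from collections import defaultdict

LINE_WIDTH = 61   # characters per context line (figure quoted in this guide)
MAX_LINES = 100   # cap on context lines per centered word

def kwic_lines(text, targets, width=LINE_WIDTH, max_lines=MAX_LINES):
    """Build keyword-in-context lines of exactly `width` characters.

    The centered word always starts at column index width // 2.
    No stop list is applied: function words are valid targets.
    """
    targets = {t.lower() for t in targets}
    tokens = text.split()
    center = width // 2
    pad = center - 1                       # room for left context before one space
    out = defaultdict(list)
    for i, tok in enumerate(tokens):
        key = tok.strip('.,;:!?"()').lower()
        if key not in targets or len(out[key]) >= max_lines:
            continue                       # not a target, or cap already reached
        left = " ".join(tokens[:i])
        right = " ".join(tokens[i:])
        left_part = left[-pad:].rjust(pad) if pad > 0 else ""
        line = left_part + " " + right
        out[key].append(line[:width].ljust(width))
    return dict(out)

# Invented sample text; max_lines is lowered to show the cap at work.
sample = "The user can get the data. You get what you ask for."
lines = kwic_lines(sample, ["get", "the"], max_lines=1)
for word, contexts in lines.items():
    for context in contexts:
        print(repr(context))
```

Fixing the start column of the centered word is what lets a reader scan a
page of contexts vertically; truncating both sides to a fixed width is the
price paid for keeping each context on a single output line.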


' 
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oe 


3. DEFINITION OF FIELDS


There are 12 fields in which data can be entered. Each will be
discussed in turn in regard to the type and format of the data allowed.


The word for which contexts are entered (the centered word) is, for
the Brown and W7 contexts, the canonical, uninflected SOLAR word as
found in the companion Word Index file. (All inflected forms appearing
in the SUR lexicons are linked to their corresponding canonical form in
the Word Index file.)

The words in the SUR protocols have no such inflectional
limitation. Every word appearing in the protocols is included. Further,
no limitation is placed on the number of contexts included. (The


protocols do not exceed 200 sentences in length, so an excessive number
of contexts is not a concern.)
bee BOP ALE: 


To help SUR researchers restrict their searches to the words of the
lexicons they are currently working with, we have entered here the
name(s) of the SUR lexicons containing the centered word or one of its
inflected forms. Each name is entered with a separate field identifier
and is searchable via inverted indexes. A list of names currently
employed follows:(7)


(7) CMU refers to the Carnegie-Mellon University SUR project; BBN


AP        (Small Associated Press Release--CMU)
A'’SYS    (Large Associated Press Release--CMU)
CHESS     (Chess Playing Lexicon--CMU)
ent       (Second Chess Playing Lexicon--CMU)
DESCAL    (Desk Calculator Lexicon--CMU)
DOCTOR    (Medical Lexicon--CMU)
PACFLEET  (Warships Lexicon--SDC)
SMALLWORD (Core of Lunar Rocks Lexicon--BBN)
SUBS      (Subset of Jane's Fighting Ships--SDC)
TRAVEL    (Business Travel Lexicon--BBN)
VOCAB     (Pump and Faucet Repair Lexicon--SRI)
WORDS     (Expansion of SMALLWORD--BBN)


Additional lexicons can be added as requested by users.


The contexts extracted from the Brown Corpus are taken from 500
samples of text. This field identifies the sample, using the codes as
given in the machine transcripts. For detailed information about the
constitution and source of each sample, the reader should consult the
Manual of Information (1964).
Gi eel © 2 kel? 


Each context appearing in the Brown Corpus is identified as to its
position within a sample by a line number. We have included that
identifying four-digit number here.


The centered word and its surrounding context are found here. As
explained in section 5, DATA COLLECTION, this is a 61 character
substring of the 195 characters of context provided in the original

refers to the Bolt Beranek and Newman SUR project; SDC refers to the SUR
project at System Development Corporation; SRI refers to the Stanford
Research Institute project currently providing direct support to the SDC
SUR project.


For those contexts extracted from the protocols collected by the
SUR projects, we enter here the date by which the protocol is
identified.


SUR Sentence No.:

The unique number assigned to each sentence in a given SUR
protocol is included herein.


SUR Sentence Type:

Since protocols generally represent the interaction of a user with
a (simulated) computer, it is useful to identify the pragmatics of a
given context. This field permits a context substring to be tagged as
coming from a user's interrogative, imperative, or declarative input, a
user's parenthetic comment, the computer's response, or a monitor's
comment.


File 2 mes 25s 
The centered word and surrounding context are entered here. All
sentences appearing in the protocol are entered; however, only 52
characters are permitted per context.
Po Bate “bay: 




Included here is the main entry word whose definition contains the
centered word.


The homograph number together with the sense and subsense letters
and numbers of the definition that contains the centered word are given
here.


W7 Context:
This field contains the centered word and its context as found
within a particular sense definition. As with the other contexts, a
maximum of 62 characters is permitted.


4. DATA RETRIEVAL

The information in the KWIC file is available in two modes: via
on-line queries to the SOLAR data management system over the ARPA
Network and by listings distributed by the SOLAR staff.


4.1 ON-LINE ACCESS

All SOLAR files reside in the SDC SOLAR data management system.(8)
Since the system is self-documenting and exceptionally user-oriented,
our guidance here in the use of the system is quite general.

The SOLAR data management system resides within the CMS
time-sharing system running on an IBM 370/145 at SDC. CMS is accessible
through the ARPA Network via either TELNET or TIP connections.

(1) To connect to SDC CMS via a TIP, make sure your terminal is
set to full duplex and type:

2° ¢SP> A <5B> 1. <cCR> 'transmi* on Linefeed? 

a? €3D>% © gop ‘ley to hast $9 fps)! 


The response to you should be:


DEVE ‘TIP cava you are nov connected? 
Spf 2707185 Hewat "SD net sat 
Y4-370 CNLINF "Shc +ige-sharina maa 

. 'period is the login prompt'


(8) The SOLAR data management system has come into existence largely
because of the selfless, diligent, and competent work of Roy Cater.
Through his efforts the system was made compatible with CMS and
the initial compilations were accomplished.


1Prpgely oF bis tine Sud Bypertise. 


DATABASE <CR>. This will elicit an explanation of the various
categories of data in the file. (The display reproduced at this point
is not legible in the source.)


At this point CMS is expecting you to login.

(2) To login, type: LOGIN SOLAR <CR>. SOLAR will then print some
sign-on messages and take care of mounting disk packs (if necessary).
The signal for your input throughout your interaction with SOLAR will be
a hyphen (-) in column 1. Please wait for that prompt before typing.
Terminal input may be either upper case, lower case, or a mixture.

(3) To obtain an introduction to the SOLAR DMS, ask for the
format when given that option. Or, type: "EXPLAIN SUMMARY"
<CR> (with quotes). SOLAR will then give you a brief idea of searching
and printing procedures, command names, and program messages.

(4) To access the KWIC file,(9) type: "FILE KWIC" <CR>.

analy# (WP) <CR>. The search terms must be entered unpunctuated. The
# sign stands for an indeterminate string of characters. The category
in parentheses limits the search to that single field.

A search can also be made of the non-indexed fields using the
STRINGSEARCH facility. Type "EXPLAIN STRINGSEARCH" <CR> for details.
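
The effect of the # sign in a search term can be illustrated outside
SOLAR with a small Python sketch. It is not the SOLAR search routine: the
function names, the record layout, and the field name "solar_word" are
assumptions made for the example, and it assumes that the indeterminate
string may be empty and that matching ignores case.

```python
import re

def term_to_regex(term):
    """Translate a search term in which '#' stands for an indeterminate
    string of characters into an anchored, case-insensitive pattern."""
    parts = (re.escape(part) for part in term.split("#"))
    return re.compile("^" + ".*".join(parts) + "$", re.IGNORECASE)

def search(records, term, field_name):
    """Return the records whose named field matches the search term.
    Restricting the match to one field mirrors the category in parentheses."""
    pattern = term_to_regex(term)
    return [r for r in records if pattern.match(r.get(field_name, ""))]

# Invented records; "solar_word" is a hypothetical field name.
records = [
    {"solar_word": "analyze"},
    {"solar_word": "analysis"},
    {"solar_word": "banal"},
]
hits = search(records, "analy#", "solar_word")
print([r["solar_word"] for r in hits])
```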
(7) To print data once contexts have been selected, you can use
one of the following special print formats:

     COMMAND            FIELDS RETURNED
     "PRINT BROWN"      SOLAR Word and Brown Contexts
     "PRINT SUR"        SOLAR Word and SUR Contexts
     "PRINT W7"         SOLAR Word and W7 Contexts

It is also possible to tailor your own print commands. Type "EXPLAIN PRINT"
<CR> for details.
(8) To halt printout of data on your terminal, hit the break key
and wait for the SOLAR prompt (-). Then type: HT <CR> (halt
typing). When prompted again, hit <CR> and wait. SOLAR will ask for your
next search statement or command.

(9) To switch to another data file, type: "FILE <NAME>" <CR>.
i 
E Zar ¢ vag pTe EG: crwynrnye Eyes Try Yseort aim the #jles available, 
Rue te es Bit <em> 
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at 
2 U9 cCeMungar rrsfixges 
request them from Tim Diller. The user is advised, however, that the
on-line version is likely to be more current than the printouts, which
will be produced only at intervals of significant accretion.
Olney and Ramsey (1972)). The reformatting of these contexts for input
to the SOLAR DMS will be undertaken following a period of evaluation of
the utility of the Brown contexts.

Four SUR protocols are currently in hand and being readied for
inclusion in the archive. Two protocols were collected for the SDC SUR
project by Barbara Deutsch of SRI. They consist of about 120 queries
posed during an experiment at the Naval Postgraduate School in
Monterey, California. A member of the faculty at the school simulated
the computer using a copy of the SDC submarine data base taken from
Jane's Fighting Ships. Two naval officers were assigned tasks that
required them to fill out three charts and solve two problems, all
requiring information about specifications and performance
characteristics of submarines in the U.S., Soviet, and British fleets.
One subject was a lieutenant with computer experience but no experience
on submarines. The other was a lieutenant with several years' experience
on nuclear submarines and no computer background. These differences are
reflected in the questions asked by the two subjects.
The third protocol was based on the same task domain but was
collected at SRI. The subject was knowledgeable in computers but
unfamiliar with submarines.


The fourth protocol consists of sentences collected by the CMU
SUR project. This protocol differs from the others mainly in being
simply a collection of sentences that could be used to query the
Associated Press Release data base. There is no continuity of thought
from one question to the next or from an answer to the next question.
ie be aoe ne ' tame) or ':aal! eegnining sera than & single 
SAMPLE ON-LINE INTERACTION

(The terminal dialogue at the left of this display, including the Brown
contexts printed for GET, is not legible in the source. The annotations
at the right read as follows.)

<... form>
<SOLAR asks for first search statement or command>
<User chooses to access KWIC file>
<SOLAR asks for search statement or command>
<User searches for entry having "get" as searchable word>
<SOLAR indicates there is one such entry>
<SOLAR asks for next search statement or command>
<User commands print of Brown contexts>
<Via the break key user tells CMS to halt typing>
<When prompted, user responds with <CR>>
<SOLAR asks for next search statement or command>
<User logs out>

